Data feature selection based on Artificial Bee Colony algorithm

Authors

  • Mauricio Schiezaro
  • Hélio Pedrini
Abstract

Classification of data in large repositories requires efficient analysis techniques, since a large number of features is created to better represent such data. Optimization methods can be used in the feature selection process to determine the most relevant subset of features from the data set while maintaining an adequate accuracy rate relative to that obtained with the original set of features. Several bioinspired algorithms, that is, algorithms based on the behavior of living beings in nature, have been proposed in the literature to solve optimization problems. This paper investigates, implements, and analyzes a feature selection method that uses the Artificial Bee Colony approach for the classification of different data sets. Various UCI data sets have been used to demonstrate the effectiveness of the proposed method against other relevant approaches available in the literature.

Introduction

Data analysis aims at extracting and modeling information content to identify patterns within the data. To reduce the amount of information needed to describe a large set of data, features are extracted from the data and serve as representative characteristics of its contents. In image analysis, for instance, examples of features include color, texture, edges, object shape, and interest points, among others. These features are usually organized into an n-dimensional feature vector.

Feature selection is an important step in several tasks, such as image classification, cluster analysis, data mining, pattern recognition, and image retrieval, among others. It is a crucial preprocessing technique for effective data analysis, in which only a subset of the original features is chosen in order to eliminate noisy, irrelevant, or redundant features. This reduces computational cost and improves the accuracy of the data analysis process.
This paper proposes a feature selection method for data analysis based on the Artificial Bee Colony (ABC) approach that can be used in several knowledge domains through wrapper and forward strategies. The ABC method has been widely used to solve optimization problems; however, there have been few works on feature selection. Our work proposes a binary version of the ABC algorithm, where the number of new features to be analyzed in the neighborhood of a food source is determined through a perturbation parameter proposed by Karaboga and Akay [1]. The method is analyzed and compared to other relevant approaches available in the literature. Experimental results showed that a reduced number of features can achieve classification accuracy superior to that obtained using the full set of features. The accuracy increased significantly even though the number of selected features was drastically reduced. Furthermore, the proposed method presented better results for the majority of the tested data sets compared to other algorithms.

The paper is organized as follows: Initially, some relevant concepts and work related to feature selection are described. The proposed methodology for feature selection is then presented in detail. Experimental results obtained through the application of the proposed method to several data sets are described and discussed. Finally, the remaining section concludes the paper with final remarks and directions for future work.

*Correspondence: [email protected]
Institute of Computing, University of Campinas, Campinas, São Paulo 13083-852, Brazil
© 2013 Schiezaro and Pedrini; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Schiezaro and Pedrini, EURASIP Journal on Image and Video Processing 2013, 2013:47, http://jivp.eurasipjournals.com/content/2013/1/47

Related concepts and work

The process of feature selection is responsible for electing a subset of features and can be described as a search in a state space. One can perform a full search in which the whole space is traversed; however, this approach is impractical for a large number of features. A heuristic search considers, at each iteration, the features not yet selected for evaluation. A random search generates random subsets within the search space; several bioinspired and genetic algorithms use this approach [2].

According to the initialization and the behavior during the search steps, the search can be divided into three different approaches [3]:

  • forward: the feature subset is initialized empty, and features are included in the subset during the feature selection process;
  • backward: the feature subset is initialized with the full set of features, and features are excluded from the subset during the feature selection process;
  • bidirectional: features can be inserted or excluded during the feature selection process.

Feature selection methods can be classified into two main categories: filter approaches [4-9] and wrapper approaches [10-14]. In filter approaches, a filtering process is performed before the classification process; therefore, they are independent of the classification algorithm used [15]. A weight value is computed for each feature, such that the features with the best weight values are selected to represent the original data set. Wrapper approaches, on the other hand, generate candidate feature subsets by adding and removing features, and then employ classification accuracy to evaluate each resulting subset. Wrapper methods usually achieve better results than filter methods.
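To make the wrapper idea concrete, the following minimal Python sketch scores a candidate feature subset by the accuracy of a classifier that sees only the selected features. The leave-one-out 1-nearest-neighbor classifier, the function name `subset_accuracy`, and the toy data are assumptions chosen for illustration; they are not taken from the paper.

```python
def subset_accuracy(samples, labels, mask):
    """Leave-one-out 1-NN accuracy using only the features where mask[j] == 1."""
    selected = [j for j, bit in enumerate(mask) if bit == 1]
    if not selected:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, x in enumerate(samples):
        best_dist, best_label = None, None
        for k, y in enumerate(samples):
            if k == i:
                continue  # leave the query sample out of the training set
            dist = sum((x[j] - y[j]) ** 2 for j in selected)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[k]
        if best_label == labels[i]:
            correct += 1
    return correct / len(samples)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
samples = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
labels = [0, 0, 0, 1, 1, 1]

print(subset_accuracy(samples, labels, [1, 0]))  # informative feature only
print(subset_accuracy(samples, labels, [0, 1]))  # noisy feature only
```

A filter method would instead rank each feature with a classifier-independent weight; the wrapper call above is more expensive because every candidate subset requires a full evaluation of the classifier.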
Many evolutionary algorithms have been used for feature selection, including genetic algorithms and swarm algorithms [16]. Swarm algorithms include, in turn, Ant Colony Optimization (ACO) [5,17,18], Particle Swarm Optimization (PSO) [19], Bat Algorithm (BAT) [2], and Artificial Bee Colony [1,20-22].

The use of swarm intelligence for feature selection has increased in recent years. Suguna and Thanushkodi [23] proposed a rough set approach with the ABC algorithm for dimensionality reduction, tested on different medical data sets in the area of dermatology, whereas Shokouhifar and Sabet [24] employed the same algorithm (ABC) for feature selection using neural networks. Particle Swarm Optimization has been proposed for feature selection either as a filter method [15] or as a wrapper method [25-27]. Nakamura et al. [2] proposed a wrapper method using a BAT algorithm with an OPF classifier. Among the Ant Colony Optimization approaches to feature selection, we can highlight the ACO for image feature selection proposed by Chen et al. [28].

The Artificial Bee Colony is a swarm intelligence algorithm used to solve optimization problems in several research areas [29-33]. It was proposed by Karaboga [20] in 2005, based on the foraging behavior of honeybees. Frisch [34], Frisch and Lindauer [35], and Seeley [36] investigated the foraging behavior of bees, external information (odor, location information in the waggle dance, presence of other bees at the food source or between the hive and the source), and internal information (source location and source odor). The process starts when bees leave the hive to forage for a food source (nectar). After finding nectar, the bees store it in their stomach. After returning to the hive, the bees unload the nectar and perform a waggle dance to share information about the food source (nectar quantity, distance, and direction from the hive) and to recruit new bees to explore the richest food sources [37].
The minimal ABC model from which the collective intelligence of a bee swarm emerges consists of three components: food sources, employed bees, and unemployed bees [38], which are described as follows:

  • Food sources: each food source represents a probable solution to the problem.
  • Employed bees: employed bees find a food source, store information about its quality, and share this information with other bees in the hive. The number of food sources is equal to the number of employed bees.
  • Unemployed bees: unemployed bees can be of two types: onlooker bees or scout bees.
    – Onlooker bees: onlooker bees receive information from employed bees about the quality of food sources and choose food sources of better quality in order to explore their neighborhood. At the moment that onlooker bees choose a food source to explore, they become employed bees.
    – Scout bees: employed bees become scout bees when a food source is exhausted. In other words, the employed bees explored the neighborhood of a food source MAX LIMIT times without finding any food source of better quality. Scout bees try to find new food sources.

A general pseudocode for the ABC optimization approach [22] is shown in Algorithm 1.

Algorithm 1 ABC optimization approach
  Initialization Phase
  repeat
    Employed Bee Phase
    Onlooker Bee Phase
    Scout Bee Phase
    Memorize the best solution achieved so far
  until (Cycle = Maximum Cycle Number or a Maximum CPU time)

Initialization phase

The original algorithm [1] proposes a random creation of food sources, such that each one of them corresponds to a possible solution to the problem:

$$x_{ij} = x_j^{\min} + \mathrm{rand}(0, 1)\,\bigl(x_j^{\max} - x_j^{\min}\bigr) \tag{1}$$

where $i = 1, \ldots, N$ and $j = 1, \ldots, D$, such that $N$ is the number of food sources and $D$ is the number of optimization parameters.
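Equation 1 can be sketched in a few lines of Python. The function name `init_food_sources`, the bound lists, and the fixed seed are assumptions made for this illustration only.

```python
import random


def init_food_sources(n_sources, lower, upper, rng):
    """Equation 1: x_ij = x_j^min + rand(0, 1) * (x_j^max - x_j^min)."""
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(n_sources)
    ]


rng = random.Random(0)  # fixed seed only so that the example is repeatable
lower = [-1.0, 0.0, 10.0]  # hypothetical lower bounds x_j^min, so D = 3
upper = [1.0, 2.0, 20.0]   # hypothetical upper bounds x_j^max
sources = init_food_sources(5, lower, upper, rng)  # N = 5 food sources
for source in sources:
    print(source)
```

Each row is one food source, that is, one candidate solution with D parameters drawn uniformly within its bounds.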
Employed bee phase

Each employed bee explores the neighborhood of the food source associated with it. The neighborhood exploration is defined as

$$v_{ij} = x_{ij} + \phi_{ij}\,(x_{ij} - x_{kj}). \tag{2}$$

For each food source $x_i$, a food source $v_i$ is determined through the modification of one optimization parameter $j$, that is, $x_{ij}$ is modified. Indices $j$ and $k$ are random variables. The value of $k$ is in the range $1, 2, \ldots, SN$, where $SN$ is the number of food sources, and must be different from $i$; $\phi_{ij}$ is a real number between −1 and 1. Once $v_i$ is produced, the fitness value of the food source is obtained by

$$\mathrm{fitness}_i = \begin{cases} \dfrac{1}{1 + f_i}, & \text{if } f_i \geq 0 \\ 1 + \mathrm{abs}(f_i), & \text{if } f_i < 0 \end{cases} \tag{3}$$

where $f_i$ is a cost function. For maximization problems, the cost function can be directly used as a fitness value.

After all employed bees have conducted their search, they share the information about the quality of the food sources with the onlooker bees. The probability of an onlooker bee choosing a food source to be explored is associated with its fitness, that is,

$$p_i = \frac{\mathrm{fitness}_i}{\sum_{n=1}^{F} \mathrm{fitness}_n}. \tag{4}$$

The food sources are selected by the onlooker bees according to these exploration probabilities.

Onlooker bee phase

The food sources with the highest probability of being explored are selected by the onlooker bees, which then become employed bees. The neighborhood of each selected food source is explored as explained in the 'Employed bee phase' subsection.

Scout bee phase

The algorithm checks whether there is any exhausted source to be abandoned. To decide whether a source is to be abandoned, the LIMIT variable, which has been updated during the search, is used. If the value of LIMIT is greater than that of MAX LIMIT, then the food source is assumed to be exhausted and is abandoned. The food source abandoned by its bee is replaced with a new food source discovered by the scout. The new food source associated with the scout bee is created randomly.
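The three formulas of the employed bee phase can be sketched as follows. This is an illustrative Python reading of Equations 2 to 4, not the authors' implementation; the function names and the `sphere` cost function are assumptions introduced for the demonstration.

```python
import random


def neighbor(sources, i, rng):
    """Equation 2: v_ij = x_ij + phi_ij * (x_ij - x_kj) for one random parameter j."""
    j = rng.randrange(len(sources[i]))  # parameter to modify
    k = rng.choice([s for s in range(len(sources)) if s != i])  # partner, k != i
    phi = rng.uniform(-1.0, 1.0)  # phi_ij in [-1, 1]
    candidate = list(sources[i])
    candidate[j] = sources[i][j] + phi * (sources[i][j] - sources[k][j])
    return candidate


def fitness(cost):
    """Equation 3: convert a cost value f_i into a fitness value."""
    return 1.0 / (1.0 + cost) if cost >= 0 else 1.0 + abs(cost)


def selection_probabilities(fitness_values):
    """Equation 4: each food source's share of the total fitness."""
    total = sum(fitness_values)
    return [f / total for f in fitness_values]


def sphere(x):
    """Hypothetical cost function used only for this demonstration."""
    return sum(v * v for v in x)


rng = random.Random(1)
sources = [[1.0, 2.0], [0.5, -1.0], [-2.0, 0.0]]
fits = [fitness(sphere(s)) for s in sources]
probs = selection_probabilities(fits)
print(probs)
print(neighbor(sources, 0, rng))
```

The source with the lowest cost receives the highest fitness and therefore the largest probability of being chosen by an onlooker bee.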
Artificial Bee Colony algorithm for feature selection

Unlike optimization problems in which the candidate solutions can be represented by vectors of real values, the candidate solutions to the feature selection problem are represented by bit vectors. Each food source is associated with a bit vector of size N, where N is the total number of features. Each position in the vector corresponds to one feature. If the value at a position is 1, the corresponding feature is part of the subset to be evaluated; if the value is 0, the feature is not part of the subset. Additionally, each food source stores its quality (fitness), which is given by the accuracy of the classifier using the feature subset indicated by the bit vector.

The main steps of the proposed feature selection method are illustrated in Figure 1. Each step is described as follows:

1. Create initial food sources: for feature selection, it is desirable to search for the best accuracy using the lowest possible number of features. For this reason, the proposed method follows the forward search strategy. The algorithm is initialized with N food sources, where N is the total number of features. Each food source is initialized with a bit vector of size N in which only one feature is present in the feature subset, that is, only one position of the vector is filled with 1.

2. Submit a feature subset of food sources to the classifier and use accuracy as fitness: the feature subset of each food source is submitted to the classifier, and the accuracy is stored as the fitness of the food source.

3. Determine neighbors of chosen food sources by employed bees using the modification rate (MR) parameter: each employed bee visits a food source and explores its neighborhood. For feature selection, a neighbor is created from the bit vector of the original food source.
In the basic version of the ABC algorithm, the neighborhood is defined by performing a small perturbation in only one optimization parameter through Equation 2, which makes convergence slower. In feature selection, the optimization parameters are represented by the bit vectors, and their perturbation is performed by a perturbation frequency or MR [1]. For each position of the bit vector, that is, for each feature, a uniform random number $R_i$ is generated in the range between 0 and 1. If this value is lower than the perturbation parameter MR, the feature is inserted into the subset, that is, the vector value at that position is filled with 1. Otherwise, the value of the bit vector is not modified. This is expressed in Equation 5:

$$x_i = \begin{cases} 1, & \text{if } R_i < MR \\ x_i, & \text{otherwise} \end{cases} \tag{5}$$

where $x_i$ is the position $i$ in the bit vector.

Figure 1 Steps of ABC feature selection. Diagram with the main steps of the proposed ABC feature selection method.

4. Submit a feature subset of neighbors to the classifier and use accuracy as fitness: the feature subset created for each neighbor is submitted to the classifier, and the accuracy is stored as the neighbor's fitness.

5. Fitness of neighbor is better?: if the quality of the newly created neighbor is better than that of the food source under exploration, then the neighbor becomes the new food source and information about its quality will be shared with other bees. Otherwise, the variable LIMIT of the food source whose neighborhood is being explored is incremented. If the value of LIMIT is greater than that of MAX LIMIT, then the food source is abandoned, that is, the food source is exhausted.
In other words, the employed bees explored the neighborhood of a food source MAX LIMIT times without finding any food source of better quality, so it is not worthwhile to keep following a path where all surrounding food sources have worse quality than the current source. For each abandoned source, the method creates a scout bee to randomly search for a new food source. The search mechanism is illustrated in Figure 2.

6. All onlookers are distributed?: onlooker bees collect information about the fitness of the food sources visited by employed bees and choose food sources with either better probability of exploration or better fitness. At the moment that onlooker bees choose the food source to be explored, they become employed bees and execute step 3.

7. Memorize the best food source: after all onlookers have been distributed, the food source with the best fitness is stored.

8. Find abandoned food sources and produce new scout bees: for each abandoned food source, a scout bee is created and a new food source is generated, for which a bit vector of size N is randomly created and submitted to the classifier, and the accuracy is stored. The new food source is assigned to the scout bees, which then become employed bees and execute step 3.

Experimental results

This section describes the data sets tested in our experiments, the computational resources used to implement and evaluate the proposed feature selection method, the strategies adopted in the data classification, the ABC parameters, as well as a discussion of the experimental results.

Data sets

The proposed method has been evaluated on ten data sets from different knowledge fields. The data sets are available from the UCI Machine Learning Repository [39]. Table 1 presents a description of the tested data sets, including the number of instances, number of features, and number of classes for each data set.
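Returning to the eight steps of the proposed method, the sketch below shows one way they could fit together in Python. It is a simplified illustration under stated assumptions, not the authors' implementation: the names (`mr_neighbor`, `explore`, `abc_feature_selection`), the default parameter values, the roulette-wheel rule used for the onlookers, and the `toy_accuracy` function that stands in for a real classifier are all hypothetical. The `evaluate` argument is expected to return a non-negative score such as classification accuracy.

```python
import random


def mr_neighbor(bits, mr, rng):
    """Equation 5: set position i to 1 when R_i < MR, otherwise keep it unchanged."""
    return [1 if rng.random() < mr else b for b in bits]


def explore(source, evaluate, mr, rng):
    """Steps 3 to 5: evaluate one neighbor and keep it only if its fitness is better."""
    candidate = mr_neighbor(source["bits"], mr, rng)
    fit = evaluate(candidate)
    if fit > source["fit"]:
        source.update(bits=candidate, fit=fit, limit=0)
    else:
        source["limit"] += 1  # one more unsuccessful exploration


def abc_feature_selection(n_features, evaluate, mr=0.1, max_limit=3, cycles=20, seed=0):
    rng = random.Random(seed)
    # Steps 1 and 2: one food source per feature, each with a single bit set.
    sources = []
    for i in range(n_features):
        bits = [1 if j == i else 0 for j in range(n_features)]
        sources.append({"bits": bits, "fit": evaluate(bits), "limit": 0})
    best = {"bits": None, "fit": float("-inf")}

    def memorize():
        # Step 7: remember the best food source seen so far.
        top = max(sources, key=lambda s: s["fit"])
        if top["fit"] > best["fit"]:
            best.update(bits=list(top["bits"]), fit=top["fit"])

    memorize()
    for _ in range(cycles):
        for s in sources:  # employed bees, one per food source
            explore(s, evaluate, mr, rng)
        # Step 6: onlookers pick sources with probability proportional to fitness
        # (assumed roulette-wheel rule; uniform choice if every fitness is zero).
        weights = [s["fit"] for s in sources]
        if sum(weights) <= 0:
            weights = [1.0] * len(sources)
        for s in rng.choices(sources, weights=weights, k=len(sources)):
            explore(s, evaluate, mr, rng)
        memorize()
        # Step 8: scouts replace exhausted sources with random bit vectors.
        for s in sources:
            if s["limit"] > max_limit:
                bits = [rng.randint(0, 1) for _ in range(n_features)]
                s.update(bits=bits, fit=evaluate(bits), limit=0)
    memorize()  # also account for sources created by scouts in the last cycle
    return best["bits"], best["fit"]


def toy_accuracy(bits):
    """Hypothetical stand-in for classifier accuracy: features 0 and 3 are useful."""
    useful = bits[0] + bits[3]
    useless = sum(bits) - useful
    return max(0.0, 0.5 * useful - 0.05 * useless)


best_bits, best_fit = abc_feature_selection(6, toy_accuracy)
print(best_bits, best_fit)
```

Because Equation 5 only ever sets bits to 1, the neighbor search can only grow a subset, which matches the forward strategy; in this sketch, features are removed only when a scout draws a fresh random bit vector.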
UCI data sets have been widely used in the evaluation of data classification since they contain a varied number of features and classes, allowing analysis of how feature selection influences accuracy and performance (Table 2).

Figure 2 Search mechanism. Diagram with the search mechanism and neighborhood exploration of the ABC feature selection method.

Table 1 Summary of UCI data sets

Data set              Number of instances   Number of features   Number of classes
Image Segmentation    2,310                 19                   7

Comparison against other methods

The proposed method was compared to other relevant approaches: ACO, PSO, and genetic algorithms (GAs) (Table 3).

Computational environment

All the experiments have been conducted on a computer with an Intel Core i7-2600 3.4 GHz processor and 4 GB of RAM. The Artificial Bee Colony feature selection algorithm

Similar resources

BeeID: intrusion detection in AODV-based MANETs using artificial Bee colony and negative selection algorithms

Mobile ad hoc networks (MANETs) are multi-hop wireless networks of mobile nodes constructed dynamically without the use of any fixed network infrastructure. Due to inherent characteristics of these networks, malicious nodes can easily disrupt the routing process. A traditional approach to detect such malicious network activities is to build a profile of the normal network traffic, and then iden...

Feature Selection with Chaotic Hybrid Artificial Bee Colony Algorithm based on Fuzzy (CHABCF)

Feature selection plays an important role in data mining and pattern recognition, especially in the case of large scale data. Feature selection is done due to large amount of noise and irrelevant features in the original data set. Hence, the efficiency of learning algorithms will increase incredibly if these irrelevant data are removed by this procedure. A novel approach for feature selection i...

A Novel Discrete Artificial Bee Colony Algorithm for Rough Set-based Feature Selection

Feature selection plays an important role in the fields of pattern recognition, data mining and machine learning. Rough set method is one of effective methods for feature selection, which can preserve the meaning of the features. Presently ant colony optimization (ACO) has been successfully applied to rough set-based feature selection, however, it has the limitations of many control parameters,...

Elite Opposition-based Artificial Bee Colony Algorithm for Global Optimization

 Numerous problems in engineering and science can be converted into optimization problems. Artificial bee colony (ABC) algorithm is a newly developed stochastic optimization algorithm and has been widely used in many areas. However, due to the stochastic characteristics of its solution search equation, the traditional ABC algorithm often suffers from poor exploitation. Aiming at this weakness o...

Artificial Bee Colony based Feature Selection for Effective Cardiovascular Disease Diagnosis

Machine learning has been an effective support system in medical diagnosis which involve large amount of data. Analyzing such data consumes more time in terms of execution and resources. All data features do not support for the end results. Hence it is very important to identify the features that contribute more in identifying the diseases. Those with less contribution can be eliminated. The ne...

BQIABC: A new Quantum-Inspired Artificial Bee Colony Algorithm for Binary Optimization Problems

Artificial bee colony (ABC) algorithm is a swarm intelligence optimization algorithm inspired by the intelligent behavior of honey bees when searching for food sources. The various versions of the ABC algorithm have been widely used to solve continuous and discrete optimization problems in different fields. In this paper a new binary version of the ABC algorithm inspired by quantum computing, c...


Journal title:
  • EURASIP J. Image and Video Processing

Volume: 2013

Publication date: 2013